Unsupervised Learning Layers for Video Analysis
Abstract
This paper presents two unsupervised learning layers (UL layers) for label-free video analysis: one for fully connected layers and one for convolutional layers. The proposed UL layers can play two roles. They can serve as the cost function layer, providing a global training signal. They can also be added to any regular neural network layer to provide local training signals, which are combined with the training signals backpropagated from upper layers to extract both slow- and fast-changing features at layers of different depths. The UL layers can therefore be used in either purely unsupervised or semi-supervised settings. Both a closed-form solution and an online learning algorithm are provided for the two UL layers. Experiments with unlabeled synthetic and real-world videos show that neural networks equipped with UL layers and trained with the proposed online learning algorithm can extract shape and motion information from video sequences of moving objects. The experiments also demonstrate potential applications of the UL layers and the online learning algorithm to head orientation estimation and moving object localization.
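The idea of extracting slowly changing features from an unlabeled sequence can be illustrated with classic linear slow feature analysis (SFA), which also has a closed-form solution. The sketch below is a stand-in chosen for illustration only: it is not the paper's UL layer, cost function, or online algorithm, and the function name `linear_sfa` and the toy signals are assumptions introduced here.

```python
import numpy as np


def linear_sfa(x, n_features):
    """Closed-form linear slow feature analysis (illustrative, not the UL layer).

    x: array of shape (T, D), one row per time step (e.g. flattened frames).
    Returns (mean, w) where w has shape (D, n_features); columns are ordered
    from the slowest-varying projection upward.
    """
    mean = x.mean(axis=0)
    xc = x - mean

    # Whiten the input so every projection has unit variance.
    cov = xc.T @ xc / len(xc)
    evals, evecs = np.linalg.eigh(cov)
    keep = evals > 1e-10  # drop numerically degenerate directions
    whiten = evecs[:, keep] / np.sqrt(evals[keep])
    z = xc @ whiten

    # Among unit-variance projections, pick those whose temporal
    # differences have the smallest variance (the "slow" directions).
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / len(dz)
    _, devecs = np.linalg.eigh(dcov)  # eigenvalues in ascending order
    w = whiten @ devecs[:, :n_features]
    return mean, w


# Toy demo: two observed channels mix one slow and one fast source.
t = np.linspace(0.0, 2.0 * np.pi, 500)
slow, fast = np.sin(t), np.sin(29.0 * t)
x = np.stack([slow + fast, slow - 0.5 * fast], axis=1)

mean, w = linear_sfa(x, n_features=1)
y = (x - mean) @ w  # slowest feature, up to sign
print("correlation with slow source:", abs(np.corrcoef(y[:, 0], slow)[0, 1]))
```

In this toy setting the slowest projection tracks the slow source up to sign. A layer-wise method such as the one the abstract describes would instead apply a local unsupervised signal inside a network and combine it with backpropagated signals, which this sketch does not attempt.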
Similar resources
Action Change Detection in Video Based on HOG
Background and Objectives: Action recognition, the process of labeling an unknown action in a query video, is a challenging problem due to event complexity, variations in imaging conditions, and intra- and inter-individual action variability. A number of solutions have been proposed to solve the action recognition problem. Many of these frameworks suppose that each video sequence includes only one ...
Recognition of Visual Events using Spatio-Temporal Information of the Video Signal
Recognition of visual events as a video analysis task has become popular in the machine learning community. While traditional approaches to video event detection have been used for a long time, recently developed deep learning based methods have revolutionized this area. They have enabled event recognition systems to achieve detection rates which were not reachable by traditional approac...
Deep Learning of Invariant Spatio-Temporal Features from Video
We present a novel hierarchical and distributed model for learning invariant spatiotemporal features from video. Our approach builds on previous deep learning methods and uses the Convolutional Restricted Boltzmann machine (CRBM) as a building block. Our model, called the Space-Time Deep Belief Network (STDBN), aggregates over both space and time in an alternating way so that higher layers capt...
Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning
While great strides have been made in using deep learning algorithms to solve supervised learning tasks, the problem of unsupervised learning — leveraging unlabeled examples to learn about the structure of a domain — remains a difficult unsolved challenge. Here, we explore prediction of future frames in a video sequence as an unsupervised learning rule for learning about the structure of the vi...
Fast analysis of scalable video for adaptive browsing interfaces
Driven by a high demand for user-centred video interfaces and recent advances in scalable video coding technology, this work introduces a novel framework for video browsing by utilising inherently hierarchical compressed-domain features of scalable video and efficient dynamic video summarisation. This approach enables instant adaptability of generated video summaries to user requirements, avail...
Journal: CoRR
Volume: abs/1705.08918
Publication date: 2017